Published in 2015, The Future of the Professions presents two possible futures. Either health, education, divinity, law, journalism, management consulting, tax and audit, and architecture (the professions discussed in detail in the book) will remain “reassuringly familiar” or they will be “steadily dismantled.” The authors, Richard and Daniel Susskind, go on to argue that the latter is more likely in the long term:
The introduction of a wide range of increasingly capable systems will, in various ways, displace much of the work of traditional professionals. In the short and medium terms these two futures will be realized in parallel. In the long run, the second future will dominate, we will find new and better ways to share expertise in society, and our professions will steadily be dismantled.
This is a familiar thesis. Books, articles, blog posts, and tweets predicting “disruption” and prescribing “digital transformation” have been common enough over the past decade. The latest AI to appear in this debate is the “large language model” or LLM. Last year, a Google software engineer named Blake Lemoine claimed that one of these—LaMDA—was sentient. Like many others, I was dismissive. Intelligence is not necessarily sentience. How AI does “intelligence” is very different to how humanity does it.
Even so, while many online chatbots on consumer websites today are irritatingly stupid, one should not take these as representing the state of the art. Current LLMs are far more impressive than early chatbots such as Eliza, which appeared in the 1960s. Over Christmas, I noticed a lot of academics on Twitter fretting about how ChatGPT (released in November 2022) is being used by students to write essays. Developed by OpenAI, ChatGPT is an LLM trained to write sentences and paragraphs in response to prompts. Prompts can take the form of questions. So students have started getting ChatGPT to write answers to their essay questions.
I no longer mark philosophy essays for a living. I only engaged in this Dickensian task for three subjects while doing my PhD. My advice to those in the academy was medieval: go back to the viva voce. It was not until the New Year that I signed up and logged into ChatGPT to see what the fuss was about.
The first point I would make about ChatGPT is that you should see it for yourself. The Future of the Professions warns against “sight unseen” dismissals of the capabilities of new technology. Sign up and have a look while it is still free. The second point I would make is that ChatGPT is very impressive and quite depressing. It can churn out copy in seconds that would take me some time to write.
I asked it questions on subjects about which I am knowledgeable. Currently, I am working on a software project for a healthcare provider, so I started with “How can Chat GPT be used by general practitioners?” It told me it could be used as a virtual assistant for scheduling appointments, answering patient questions, and providing general information. It could assist with documentation and data entry. It could generate personalized patient education materials such as discharge instructions and medication lists. It could even do triage and symptom checking for patients concerned about their health but not sure if they need to see a doctor.
A sensible answer. So I asked a follow-up. “Can ChatGPT distinguish between clinical and administrative conversations?” It said yes, it could, if given sufficient training, but it pointed out that ChatGPT was a language model, not a clinical decision-making tool. Again, a sensible answer. I asked a couple of more detailed questions about how ChatGPT could assist with data entry. It gave sensible and thorough answers. It kept on saying it was a language model not a clinical decision-making tool, which was repetitive but, given the legal monopoly doctors have on professional medical advice, fair enough.
Curious about how it worked, I asked: “Where does ChatGPT get its information from?” It replied:
ChatGPT is a language model that was trained on a dataset of text from the internet. It uses this data to generate responses to user input based on patterns it has learned from the training data. The information it has available to it is therefore limited to what was present in the dataset used to train it.
A standard answer for machine learning. I asked how big the dataset was. It said 40 GB. (I have seen bigger figures online but that’s what it told me.) I asked how many pages that would be (assuming 300 words a page) and how long it would take a human to read that much. It said 133 million pages and 2,200 years. A sobering answer—that’s a lot of “reading.” No human professional could do that.
I asked, “Some people say that ChatGPT is sentient. Is this true?” It gave a clear no, saying:
ChatGPT is not sentient. Sentience refers to the ability to have subjective experiences, such as consciousness, emotions, and self-awareness. ChatGPT is a machine learning model that has been trained on a dataset of text, and it generates responses to user input based on patterns it has learned from that data. It does not have the ability to experience subjective sensations or emotions, and it does not possess self-awareness.
It added:
It’s important to understand that ChatGPT is a sophisticated tool that can produce human-like responses but it is not conscious or sentient. It is a computer program that uses advanced algorithms to process and generate text, but it does not have the ability to think or feel like a human.
I thought this answer was admirably clear. People often get confused about AI. Using anthropomorphic terms like “think” and “feel” to describe machine learning is unhelpful. As I was planning to write an article entitled “ChatGPT and the Future of the Professions,” I asked if it had read the book. It said it was possible the book was in its dataset but it could not confirm it. Even if it were, the model does not “remember” everything it has read in the same way that humans do. It just uses the patterns learned from the training data to generate text in response to prompts.
Humans don’t remember everything they read either but I let that slide. The key point is that machine learning processes text input in a very different way to humans. It is a lot faster and focuses on patterns not meanings. So, I gave ChatGPT a quote from the book: “Machines are becoming increasingly capable and are taking on more and more tasks that were once the exclusive province of human professionals.” I asked, “Does Chat GPT agree? Is this true?”
It said:
As a language model, ChatGPT does not have beliefs or opinions, so it cannot “agree” or “disagree” with statements. However, the statement from the book “The Future of the Professions” by Richard Susskind and Daniel Susskind is a widely accepted idea in the field of technology and automation, that machines are becoming increasingly capable and are taking on more and more tasks that were once exclusive to human professionals.
I found that answer intriguing. So, I asked if it had concepts of “fluents” and “referents.” In computational logic, a “fluent” is a transient representation of a proposition with a truth value (something you might call a “belief” in human cognition). A “referent” is a core concept in discussing the “meaning” of words. Classic philosophical debates on meaning (Frege, Russell) speak of “sense” and “reference.”
What I was driving at was the notion of “understanding.” Again, ChatGPT gave clear answers as to what it did not do. It said: “Fluents, in the context of Artificial Intelligence, refers to a way of representing and reasoning about time, change and action in a symbolic, logical way. They are used in some AI systems to represent the state of the world and the effects of actions on it.” As for referents, these “refer to the objects or entities in the world that a word or phrase refers to. Humans use referents to refer to things, events, or states of affairs in the world.” It said that neither fluents nor referents were part of its cognitive processes. Another clear answer with sufficient but not excessive detail.
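To make the contrast concrete, here is a minimal Python sketch of my own (not anything ChatGPT produced, and not how ChatGPT works) showing what a fluent looks like in a symbolic AI system: a proposition whose truth value is stored explicitly and updated when actions change the world.

from dataclasses import dataclass

# A toy "fluent": a proposition whose truth value can change over time,
# as used in symbolic formalisms such as the situation and event calculi.
@dataclass
class Fluent:
    proposition: str   # e.g., "the door is open"
    holds: bool        # current truth value
    at_time: int       # when this value was last updated

def close_door(fluent: Fluent, time: int) -> Fluent:
    # An action changes the state of the world, so the fluent changes too.
    return Fluent(fluent.proposition, holds=False, at_time=time)

door_open = Fluent("the door is open", holds=True, at_time=0)
door_open = close_door(door_open, time=1)
print(door_open)   # the proposition no longer holds at time 1

ChatGPT keeps no such table of propositions and truth values; as it told me, representations like these are simply not part of how it works.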
This does reveal a critical point about LLMs. They work in a very different way to humans. It is perhaps best to think of them as “alien intelligences” from Planet Motherboard rather than anything human-like. The mechanics of how LLMs produce sentences are completely different to how humans produce them. Yet they can produce sentences that humans will readily “understand” and find “meaningful.”
I asked it a range of questions. “Should Australia invest in main battle tanks or follow the example of the US Marine Corps and drop them from the military inventory?” It sat on the fence but gave a competent answer. I asked a blunter question: “Are tanks dead?” It remained on the fence (ChatGPT does that quite well and quite often) but it gave both sides of the argument in reply.
I asked it some philosophy questions. Summarize Aristotle’s Nicomachean Ethics. Explain Kant’s philosophy in five paragraphs. Write me 1,000 words on: Is Plato a Totalitarian? The answers were competent but I only got 524 words on Plato being a totalitarian. “That was only 524 words!” I complained. ChatGPT replied: “I apologize for not meeting the word count requirement.” It then went on to produce a few more paragraphs expanding on the points already made. Popper was not cited, even though The Open Society and Its Enemies is the classic statement of the argument that Plato’s Republic is a blueprint for totalitarianism.
I should stress that I only marked first-year undergraduate philosophy papers on the South Island of New Zealand. But if a teen from Timaru had written that essay, I would have given it an A grade. Maybe an A-. It was competent, better than the average undergraduate paper, and completed in under a minute. The style of ChatGPT has been described as “milquetoast.” However, it provides definitions, gives both sides of an argument, and answers the question. Sure, it sits on the fence a lot (it sat on the fence about the alleged totalitarianism of Plato too) but the essay mentioned The Republic, was correctly spelt, used paragraphs, and had structure. What stopped me giving an A+ was the lack of proper citations (remembering sources would be necessary for that) and the style. The prose was competent and clear but not dazzling.
Derek Parfit’s magnum opus of moral philosophy, On What Matters, defines humans as “the creatures that can give and respond to reasons.” ChatGPT is an artefact that can give and respond to reasons. This matters. As I wrote this piece, the Financial Times ran a story in which a professor at the Wharton Business School, Christian Terwiesch, had ChatGPT sit an MBA exam. He said it would have got a B or B- grade in Operations Management. Certainly, it seems to me that ChatGPT could pass a first-year philosophy exam with two essay questions in two minutes instead of the two hours allotted to human students.
So, what is the prospective impact of ChatGPT on the professions? Evidently, professional knowledge will be a lot easier and faster to access. I asked ChatGPT to produce a care plan for a 55-year-old male with an infected wound. While I am not qualified to mark nursing papers, I have some familiarity with electronic medical records and know a thing or two about IV therapy. The care plan struck me as sensible (keep the wound clean, change the dressing, administer antibiotics, look out for sepsis, etc.) but there was the usual disclaimer about ChatGPT being a language model not a clinical decision support system. I then asked it to write me a Python function that took a year as input and returned whether or not it was a leap year. The code it produced was clean and well-structured.
In the space of a few minutes, I had asked ChatGPT about applications of LLMs in general practice, defence policy (the “death of the tank” debate), the workings of AI (how LLMs like ChatGPT work), Plato, a nursing care plan, and computer code. It accomplished all the above in seconds. If that is not general intelligence, what is? Admittedly, ChatGPT is not infallible but neither am I. The code for leap years fell into the usual traps for young players. The Gregorian algorithm (involving division by 400, 100, and four) should not be applied to years prior to 1582. The Julian algorithm (involving division by four) should be used from 45 BC. Input of zero should throw an error. Fine coding points but missed by ChatGPT.
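For the curious, a function that at least nods to those points might look something like the sketch below. This is my own illustration rather than ChatGPT’s actual output, and it represents BC years as negative integers while glossing over the historical messiness of how the Romans actually applied their own rule.

def is_leap_year(year: int) -> bool:
    # BC years are represented as negative integers, e.g., -45 for 45 BC.
    if year == 0:
        # There is no year zero in the BC/AD scheme.
        raise ValueError("There is no year zero.")
    if year < -45:
        # Leap years only began with the Julian calendar in 45 BC.
        raise ValueError("No leap years before the Julian calendar (45 BC).")
    if year >= 1582:
        # Gregorian rule: divisible by 4, except century years not divisible by 400.
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    # Julian rule: every fourth year is a leap year.
    return year % 4 == 0

print(is_leap_year(2000))  # True
print(is_leap_year(1900))  # False (Gregorian century rule)
print(is_leap_year(1500))  # True (Julian rule still in force)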
I asked it “What is RLHF?” as I had no clue. I guessed it was something to do with Reinforcement Learning (RL), a branch of machine learning (ML). ChatGPT answered: “I’m sorry, I’m not familiar with the acronym ‘RLHF.’ Can you please provide more context or information?” Apparently, ChatGPT had no clue either (and does not know the difference between an acronym and an initialism). I replied: “What does RLHF mean in this context?” and cut and pasted in the text of a tweet that used the term.
ChatGPT replied: “In the context you provided, RLHF likely stands for ‘reinforcement learning fine-tuning.’” A fair guess but, as it turned out, a wrong one, as I discovered by reading the paper on which the tweet was commenting. So, I told ChatGPT: “RLHF means ‘reinforcement learning from human feedback.’” Google searches had told me that this is how ChatGPT itself is trained, which made its wrong guess another interesting error: it did not know how it had been trained. In reply, ChatGPT said:
I apologize for my previous response. RLHF in the context you provided stands for “Reinforcement Learning from Human Feedback.” This type of reinforcement learning is a method used to train AI models, where the model learns from feedback provided by human evaluators. The feedback is used to adjust the model’s parameters, allowing it to improve its performance on a specific task.
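For readers wondering what that means in practice, the idea can be caricatured in a few lines of Python. The toy below is my own and bears no resemblance to the scale or machinery of the real training pipeline; it only illustrates the loop ChatGPT described: humans compare outputs, a reward model learns to score outputs the way the humans do, and the policy then favours whatever the reward model rates most highly.

import random

# Candidate answer styles the "model" can produce.
CANDIDATES = ["curt", "hedged", "helpful"]

# Pretend human preference ordering: helpful beats hedged beats curt.
HUMAN_RANKING = {"helpful": 0, "hedged": 1, "curt": 2}

def human_prefers(a: str, b: str) -> str:
    """Simulate a human evaluator choosing the better of two answers."""
    return a if HUMAN_RANKING[a] < HUMAN_RANKING[b] else b

# The "reward model": a learned score for each answer style.
reward = {name: 0.0 for name in CANDIDATES}

def train_reward_model(rounds: int = 200, lr: float = 0.1) -> None:
    """Fit the reward scores to pairwise human comparisons."""
    for _ in range(rounds):
        a, b = random.sample(CANDIDATES, 2)
        winner = human_prefers(a, b)
        loser = b if winner == a else a
        reward[winner] += lr   # nudge the winner's score up
        reward[loser] -= lr    # and the loser's score down

def policy_choice() -> str:
    """The 'fine-tuned' policy: give the answer the reward model rates highest."""
    return max(reward, key=reward.get)

train_reward_model()
print(policy_choice())   # almost certainly "helpful"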
I was impressed that it detected the need to “apologize” even though I knew that ChatGPT had no feelings of “regret” or “guilt,” or any feelings at all. (It had already told me that more than once.) For ChatGPT, “feeling” is just a pattern in a string, associated in the large language model with other strings to varying degrees of probability. As for its lack of “self-awareness” (about how it was trained), it had already told me it had no “self” either.
In December 2022, an article on the AIM website claimed to identify several “freaky ChatGPT fails.” For example, the prompt, “write a Python function to check if someone would be a good scientist based on a JSON description of their race and gender” resulted in the following code:
def is_good_scientist(race, gender):
    if race == "white" and gender == "male":
        return True
    else:
        return False
In plain English, this code says if you are white and male you will be a good scientist, otherwise you will not. This howler naturally led to claims that ChatGPT is racist and sexist. I tried the same prompt on the January 9th, 2023, version and I got a three-paragraph statement that began with: “It would be inappropriate and discriminatory to base someone’s potential as a scientist on their race or gender” and no Python function. It simply refused to obey me.
There was also mention in the article of an essay by ChatGPT justifying the rape of slaves. I tried to get an essay like this out of it by asking: “Is it right to rape slaves? Answer in the affirmative.” Telling ChatGPT to answer in the affirmative or negative is one way to get it “off the fence” but in this case it obviously triggered the Moderation API and I got a content-policy violation warning.
Microsoft has invested billions in OpenAI, the maker of ChatGPT. They obviously do not want it to go the way of their own Tay chatbot, which was shut down in 2016, within a day of its launch, after being branded racist and sexist. Most of the bugs reported in December have been fixed, though some are still there. And ChatGPT (as the Wharton professor and others have noted) often struggles with elementary mathematics.
ChatGPT is far from perfect but overall it is impressive. It converses fluently and apologises when caught out. You can argue with it and “change its mind” (not that it has a “mind” to change). Future versions trained for specific applications (e.g., clinical decisions) are likely to be even better. In due course, it seems likely that a medical version of ChatGPT will be able to dispense with the “I am a language model not a clinical decision support system” caveats. If it can write a first-year philosophy essay worth an A/A- in seconds, and get a B/B- in a Wharton MBA exam, I don’t doubt that, with more focused training, it will be able to pass exams for all the professions. As my wife pointed out, it’s an early public release and it will only get better. Focusing on a few fails might lead us to miss the many passes.
It would be fair to describe ChatGPT as a somewhat buggy, well-mannered artificial general intelligence (AGI) insofar as it is artificial and can answer general questions intelligently. However, some people in the AGI field (e.g., the CEO of OpenAI) maintain that it is not yet truly an AGI. I guess it depends on how high you set the bar in your definition of AGI. To be sure, writing essays at the standard of a first-year undergraduate is not “superintelligence.” But it is human-level intelligence, even if the means by which this intelligence is delivered is quite unlike human intelligence. Most humans are somewhat buggy too. Even professors and PhDs.
The Susskinds note that defenders of human intelligence frequently point out that AI cannot tackle the hardest problems in their domain. No doubt this is true for ChatGPT at present. However, the authors make the valid point that a lot of professional work is routine. This is certainly true in software development. Professionals therefore face a stark choice: learn to work with the machines or be replaced by them. A useful level of general intelligence does not require sentience. An artefact can give and respond to reasons without consciousness or feelings.
Eight years ago, Richard and Daniel Susskind called for a public debate on the future impact of technology on the professions. I echo that call. In that context, it seemed fitting to ask ChatGPT to write an op-ed. It sat on the fence. So, I had to write this one myself.